Promoter prediction in the human genome

Authors

  • Sridhar Hannenhalli
  • Samuel Levy
Abstract

Computational prediction of eukaryotic Pol II promoters has remained one of the most elusive problems despite the considerable effort devoted to it. Researchers have looked for various types of signals around the transcriptional start site (TSS), viz. oligonucleotide statistics, potential binding sites for core factors, clusters of binding sites, proximity to CpG islands, etc. The proximity of CpG islands to gene starts is now a well-established fact, although until recently it was based on very little genomic data. In this work we explore the possibility of enhancing promoter prediction accuracy by combining CpG island information with a few other biologically motivated, seemingly independent signals that cover most of what is currently known. We benchmarked the method on a much larger genomic dataset than previous studies used. We were able to improve slightly upon current prediction accuracy. Furthermore, we observe that CpG islands are the most dominant signal and that the other signals do not improve the prediction. This suggests that computational prediction of promoters for genes with no associated CpG island (typically genes with tissue-specific expression) may not even be possible when looking only at the immediate neighborhood of the TSS. We suggest some biological experiments and studies to better understand the biology of transcription.
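Since the abstract identifies CpG islands as the dominant signal, a minimal sketch of how a sequence window might be scored for CpG-island-like composition could look as follows. This is not the authors' method: the function names `cpg_stats` and `is_cpg_island_like` are hypothetical, and the thresholds (length of at least 200 bp, GC fraction of at least 0.5, observed/expected CpG ratio of at least 0.6) are commonly cited conventions assumed here for illustration, not values taken from the abstract.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA string."""
    s = seq.upper()
    n = len(s)
    if n == 0:
        return 0.0, 0.0
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    # Expected CpG count under independence is (c * g) / n.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def is_cpg_island_like(seq, min_len=200, min_gc=0.5, min_oe=0.6):
    """Apply assumed, conventional thresholds to one sequence window."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction >= min_gc and obs_exp >= min_oe
```

In practice such a test would be slid along the region upstream and downstream of a candidate TSS; the window size and step are further assumptions that the abstract does not specify.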

Similar resources

Computational prediction of miRNAs in Nipah virus genome reveals possible interaction with human genes involved in encephalitis

The current re-emergence of Nipah virus (NiV) in India has caused 11 deaths so far, and many patients have been kept in quarantine. A thorough study of previous outbreaks in Malaysia, Bangladesh and India shows cases with a high fatality rate due to acute encephalitis. Our work involves genome analysis of NiV to predict miRNAs and their target genes in humans in order to understand enc...

Genome-wide computational prediction of miRNAs in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) revealed target genes involved in pulmonary vasculature and antiviral innate immunity

The current outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in China has threatened humankind worldwide. Coronaviruses contain the largest RNA genomes among all known RNA viruses; the disease etiology can therefore be understood by analyzing the genome sequence of SARS-CoV-2. In this study, we used an ab initio based computational tool, VMir, to scan the complete geno...

Molecular detection of proteolytic activity of human parechovirus 2A protein by gene expression

Parechoviruses form one of the nine genera in the Picornaviridae family and include two human pathogens: human parechovirus types 1 and 2 (Hpev1 and Hpev2). The genome of picornaviruses encodes a single polyprotein, which undergoes a cleavage cascade performed by virus-encoded proteases to give the final virus proteins. The primary cleavage is performed by the 2A protein, and this step is critical for vi...

Promoter Recognition in human genome

A promoter is a specific region of DNA that facilitates the transcription of a particular gene. Promoters are typically located near the genes they regulate, on the same strand and upstream (towards the 5′ region of the sense strand). Promoters contain specific DNA sequences and response elements that provide binding sites for RNA polymerase and for proteins called transcription factors that rec...

Independence of color intensity variation in red flesh apples from the number of repeat units in promoter region of the MdMYB10 gene as an allele to MdMYB1 and MdMYBA

MdMYB10 gene expression results in accumulation of anthocyanin in many tissues, including the flesh of apple fruit. The MdMYB1 and MdMYBA genes are close homologues of the MdMYB10 gene, and both are responsible for the red color phenotype in apple fruit skin. In the current study, an apple genome sequence draft analysis indicated that these three genes are located in a unique contig. Further a...

ProSOM: core promoter prediction based on unsupervised clustering of DNA physical profiles

MOTIVATION: More and more genomes are being sequenced, and to keep up with the pace of sequencing projects, automated annotation techniques are required. One of the most challenging problems in genome annotation is the identification of the core promoter. Because the identification of the transcription initiation region is such a challenging problem, it is not yet a common practice to integrate ...

Journal:
  • Bioinformatics

Volume 17, Suppl 1

Publication date: 2001